Regularized Boost for Semi-Supervised Learning

Authors

  • Ke Chen
  • Shihai Wang
Abstract

Semi-supervised inductive learning concerns how to learn a decision rule from a data set containing both labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes local smoothness constraints among data into account during ensemble learning. In this paper, we introduce a local smoothness regularizer to semi-supervised boosting algorithms based on the universal optimization framework of margin cost functionals. Our regularizer is applicable to existing semi-supervised boosting algorithms to improve their generalization and speed up their training. Comparative results on synthetic, benchmark, and real-world tasks demonstrate the effectiveness of our local smoothness regularizer. We discuss relevant issues and relate our regularizer to previous work.
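
As a rough illustration of what a local smoothness constraint can look like, the sketch below scores an ensemble output by how much it varies between nearby points. It is not the regularizer defined in the paper: the Gaussian affinity, the quadratic penalty, and the names `gaussian_affinity`, `smoothness_penalty`, and `sigma` are all assumptions chosen for the example.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Pairwise similarities W[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    Assumed similarity measure for this sketch; the diagonal is zeroed so a
    point is not compared with itself.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def smoothness_penalty(F, W):
    """Return 0.5 * sum_ij W[i, j] * (F[i] - F[j])^2.

    F holds the ensemble output for every point, labeled or unlabeled.
    The value is small when nearby points receive similar outputs.
    """
    diff = F[:, None] - F[None, :]
    return 0.5 * float(np.sum(W * diff ** 2))


# Two tight clusters of two points each (toy data, not from the paper).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
W = gaussian_affinity(X, sigma=1.0)

F_smooth = np.array([1.0, 1.0, -1.0, -1.0])  # agrees within each cluster
F_rough = np.array([1.0, -1.0, 1.0, -1.0])   # disagrees within each cluster

print(smoothness_penalty(F_smooth, W) < smoothness_penalty(F_rough, W))
```

A penalty of this kind could in principle be added to a boosting cost so that each round also favors outputs that agree on neighboring points, but how the paper combines its regularizer with the margin cost functional is not stated in the abstract.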

Similar resources

Semi-supervised Learning with Regularized Laplacian

We study a semi-supervised learning method based on the similarity graph and Regularized Laplacian. We give convenient optimization formulation of the Regularized Laplacian method and establish its various properties. In particular, we show that the kernel of the method can be interpreted in terms of discrete and continuous time random walks and possesses several important properties of proximi...

Deceptive Review Spam Detection via Exploiting Task Relatedness and Unlabeled Data

Existing work on detecting deceptive reviews primarily focuses on feature engineering and applies off-the-shelf supervised classification algorithms to the problem. Then, one real challenge would be to manually recognize plentiful ground truth spam review data for model building, which is rather difficult and often requires domain expertise in practice. In this paper, we propose to exploit the ...

Efficient and Robust Semi-supervised Learning Over a Sparse-Regularized Graph

Graph-based Semi-Supervised Learning (GSSL) has limitations in widespread applicability due to its computationally prohibitive large-scale inference, sensitivity to data incompleteness, and incapability on handling time-evolving characteristics in an open set. To address these issues, we propose a novel GSSL based on a batch of informative beacons with sparsity appropriately harnessed, rather t...

Linear Manifold Regularization for Large Scale Semi-supervised Learning

The enormous wealth of unlabeled data in many applications of machine learning is beginning to pose challenges to the designers of semi-supervised learning methods. We are interested in developing linear classification algorithms to efficiently learn from massive partially labeled datasets. In this paper, we propose Linear Laplacian Support Vector Machines and Linear Laplacian Regularized Least...

Regularized factor models

This dissertation explores regularized factor models as a simple unification of machine learning problems, with a focus on algorithmic development within this known formalism. The main contributions are (1) the development of generic, efficient algorithms for a subclass of regularized factorizations and (2) new unifications that facilitate application of these algorithms to problems previously ...

Publication date: 2007